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Abstract 

The discovery of new objects in modern wide-field asteroid and comet sur- 
veys can be enhanced by first identifying observations belonging to known 
solar system objects. The assignation of new observations to a known ob- 
ject is an attribution problem that occurs when a least squares orbit already 
exists for the object but a separate fit is not possible to just the set of new 
observations. In this work we explore the strongly asymmetric attribution 
problem in which the existing least squares orbit is very well constrained 
and the new data are sparse. We describe an attribution algorithm that 
introduces new quality control metrics in the presence of strong biases in 
the astrometric residuals. The main biases arise from the stellar catalogs 
used in the reduction of asteroid observations and we show that a simple 
debiasing with measured regional catalog biases significantly improves the 
results. We tested the attribution algorithm using data from the PS1 survey 
that used the 2MASS star catalog for the astrometric reduction. We found 
small but statistically significant biases in the data of up to 0.1 arcsec that 
are relevant only when the observations reach the level of accuracy made 
possible by instruments like PS1. The false attribution rate was measured to 
be < 1/1,000 with a simple additional condition that can reduce it to zero 
while the attribution efficiency is consistent with 100%. 
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1. The problem 



When surveying the sky with the purpose of discovering new solar system 
objects known asteroids and comets are also being identified. It is desirable 
to separately treat the identification of the known moving objects for four 
reasons: to avoid claiming as a new discovery some well known object, to 
reduce the dataset while searching for new discoveries, to improve the orbits 
of the known objects while ensuring they are not contaminated by false as- 
sociations and finally, for statistical quality control of the astrometric data 
using the residuals with the well known orbits. 

The procedure of assigning new observations to known objects is aspe- 
cial case of the class of identification problems f jMilani and Gronchil . 12010 . 
Chap. 7). The problem arises because the same object is observed in short 
arcs separated by typically many years. The goal is to build the list of the 
observations belonging to the same physical object without introducing any 
false detections corresponding to another moving object, image defect, sta- 
tistical fluke, or non-solar system object. 

The attribution problem occurs when a least squares orbit has already 
been fit to a set of historical observations of an object and we want to as- 
sign a new set of observations for which it is not possible to independently 
fit a least squares orbit 
tions were described in 



j] The methods for finding and confirming attribu- 



Milani et al.l (120011) and wer e show n to be effective 



for both simulations and rea l data in Milani et al. ( 2005h (s ee also, e.g 



Granyik and Muinonen . 2008 : Sansaturio and Arratial . 201lh . Milani et al 



( 120081 ) then developed a high reliability statistical quality control procedure 
to confirm the attributions. The algorithms developed to date have been suc- 
cessful when the new data sets contain less information but are comparable 
in quantity and quality to the old one. 

In this paper we are interested in the special case in which the previously 
known asteroids have well constrained orbits while the new data to be at- 
tributed are sparse. There is typically just one tracklet, a very short arc of 



1 For a full desc r iption of the terminology see Marsden ( 19851) : Milani ( 19991 ) and 
Milani and Gronchil (|2010l Chap. 7). 



2 



astrometric observations, assembled by the observer using just a linear or a 
quadratic fit to the astrometry as a function of time. The challenge is to 
account for the data asymmetry: a small number of new accurate observa- 
tions per object from the state of the art surveys to be associated with the 
historical data set containing many low accuracy observations per object. 

The main difficulty of the asymmetric identification problem is that both 
the existing and new data are biased by the astrometric catalogs and a reli- 
able astrometric error model (including rigorously estimated RMS and cor- 
relations) is usually not available. As a consequence, the better the orbit 
is constrained by the existing data the worse is the effect of the biases on 
the ephemerides predicted from the orbit and its covariance matrix. In the 
idealized case the existing data may have a larger standard deviation but be 
unbiased (zero average astrometric error) and the observational errors could 
be modeled by a normal distribution with known RMS. 

A quantitative and real example might be more convincing than a the- 
oretical argument. The AstDydH online service tells us that at the current 
date there are 68, 571 numbered or multi- apparition asteroids (76% of those 
with apparent magnitude < 22 and solar elongation > 120°) with a formal 
RMS of the current ephemerides prediction < 0.3 arcsec. Thus, a bias in 
the astrometry of the order of 0.5 arcsec would result in a systematic error 
larger than the estimated random error. W e show that the first of the next 
generation surveys, Pan-STARRSl, (PS1, iHodapp et al.l . 120041 ) generates 
astrometric data with accuracy of 0.1 to 0.15 arcsec — of such high quality 
that if fit to orbits computed with biased historic data the two data sets 
would appear statistically incompatible. Thus, it would be very difficult to 
validate the accuracy of the new data unless debiased data were available 
(both historic and new). 

Our two-step solution is to (i) devise a new statistical quality control 
procedure which is applied asymmetrically to the old and new data and (ii) 
apply a debiasing procedure to the MPC-archived historical observations to 
remove the known position-depe ndent bias due to reg ional systematic errors 
in the astrometric star catalogs (jChesley et al.l . |2010| ). 

We tested our new methods for asymmetric attribution using early results 
from the PS1 telescope. The source orbits used were numbered and multi- 
apparition asteroids. The purpose of the test was to validate our asymmetric 



2 http://hamilton. dm. unipi.it/astdys, as of November 2011. 



3 



attribution method and measure the PS1 system's astrometric accuracy. In 
particular, we were interested in constructing a rigorous PS1 astrometric 
error model taking into account the correlations, as described 

The structure of this paper is as follows. The known algorithms for at- 
tribution are summarized in Section [2j The new algorithms to be applied 
specifically to the asymmetric case are introduced in Section El In Section @] 
we show the importance of the removal of the star catalog biases. In Sec- 
tion [5] we describe the results of two tests of our methods, both using PS1 
data, and a first attempt at building a rigorous statistical error model for 
PS1 astrometry. Section [6] gives the results of the tests for the accuracy and 
efficiency of our algorithms using real data from the PS1 survey. In Section [7] 
we draw some conclusions and indicate possible directions for future work. 



2. The attribution algorithms 

An attribution problem requires that there exist 1) a vector Xi G M 6 
of orbital elements at epoch t\ along with a 6 x 6 covariance matrix T Xl 
describing their uncertainty in the linearized approximation and 2) a vector 
y D G W of observables (s < 6) at another epoch t 2 . Let y c = F(y: 1 , t 2 ) be the 
predicted set of observables at time t 2 using the covariance T Xl to propagate 
the covariance matrix to the space of the observables by the linearized formula 

T yc = DFT^ DF T 

where DF is the s x 6 Jacobian matrix of F computed at xi. Using r yc 
we can assess the likelihood of the prediction y c being compatible with the 
hypothesis that the observation y G belongs to the same object within the 
linear approximation and assuming an unbiased Gaussian error model. 

For reasons of efficiency in handling large numbers of both orbits and 
tracklets this process should be implemented using a sequence of filters as 
described below. Each step provides an increasing likelihood of identification 
with an increa sing computational effort applied to a decreasing number of 



possible cases (jMilani et al.l . l200ll ) . 



2.1. Filter 1: Ob served- Computed residuals on the celestial sphere 

The first test is to compare the new angular observations to the prediction 
on the celestial sphere in right ascension and declination (a, S). The difference 
y — y c = ^y = (Aq, AS) G M 2 is projected on the tangent plane to the 
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celestial sphere. From the 2x2 covariance matrix r yc we compute the normal 
matrix C Yc = and the confidence ellipse 

Z yc (a c ) = {Ay\Ay T C yc Ay < a 2 c } . 

The observation y Q has its own uncertainty given by the 2x2 covariance ma- 
trix r yo and the normal matrix C yo = yielding a second confidence ellipse 
Zy o (a ) for the observed angles. In most cases we can assume the observation 
error is isotropic because the astrometric reduction process typically makes 
no distinction between the two directions. I.e., Z yo can be assumed to be 
just a circle in the (cos5Aa, AS) coordinates. 

This filtering step requires that the two ellipses intersect for reasonable 
values of the confidence parameters a c and a Q . Since this filter may have 
to be applied for each orbit xi to a large set of new observations y Q it is 
often convenient to resort to a simplified test using only the larger of the two 
ellipses. E.g., Ay £ Z yc (a c ) when the uncertainty of the prediction is larger 
and Ay £ Z yo {o~ ) when the observational errors are larger. 

When the orbit xi is already well determined (as assumed in this work) 
the two uncertainties can be comparable. A fast algorithm to handle this case 
is to select a control a c for Z yc (o~ c ) and a circular radius a Q for Z yo (o~ ) and 
then compute the inequality defining an ellipse with the semiaxes of Z yc (o~ c ) 
increased in length by o in the (cos 5Aa, A5) coordinates. 

This algorithm can be applied either to a single observation belonging 
to the set of observations for which attribution is attempted (the tracklet) 
or, better yet, to an average observation obtained from all those forming a 
tracklet because the averaging removes some of the random error. 

2.2. Filter 2: Attributables and attribution penalties 

Tracklets containing observations spanning a short time (typically 15 min 
to 2 hours) can generally be compressed into an attributable vector y = 
(a, 5, a, 5) £ M 4 containing both the angular position and angular motion 
vector. In the special case of objects that are very close to the Earth there 
is additional cu rvature informatio n that may further constrain the range 



and range-rate (IMilani et al.l . 120081 ). The attributable vector is obtained by 
a linear fit to the observations in the tracklet resulting in the observable 
z G £ M 4 at the average time t 2 with a 4 x 4 normal matrix C Zo . 

The "distance" between the observed and predicted attributables is given 
by a metrics that takes into account the uncertainty of both data points using 
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their respective normal matrices. The predicted attributable vector at time 
ti from the state (x 1; tf) is nominally z c G M 4 with a 4 x 4 normal matrix C Zr . 
Its compatibility with the observed attributable vector c an be described by 



the a ttribution penalty K4 (Chap. 7 lMilani and Gronchil . l2010t iMilani et al 



2001|) defined by 



Co — C Zc + C Zo ; To — C 1 

K 4 = (z G - z c ) T [C Zc - C Zc T C Zc ] (z - z c ) 

that corresponds geometrically to testing the intersection of the confidence 
ellipsoids with 04 = y/K 4 playing the role of the confidence parameter. 

We can discard attribution candidates when the attribution penalty ex- 
ceeds a reasonable value of erf. If the predictions were perfectly linear and 
an exact Gaussian error model was available for the observations then a x 2 
table could be used to select <r| but these conditions are never satisfied with 
real data. Thus, the parameter should be determined empirically with sim- 
ulations and/or tests with real data. 

2.3. Filter 3: Differential corrections and Quality control metrics 

Attributions that pass the second filter need to be confirmed by a least 
squares fit with all the available observations. The fit must converge and 
yield residuals with good statistical properties compatible with what we know 
about the observational errors. The most common control is the RMS of the 
residuals weighted with their accuracy. 

The attribution reliability can be increased with statistical quality controls 
based upon more than one parameter that also capture information based on 
systematic signals remaining in the residuals. We normally use the following 
10 metrics: 

• RMS root mean square of the astrometric residual^, 

• BIAS a ,BIASs bias, i.e., average of the astrometric residuals, 

• SPAN a , SPANs first derivatives of the astrometric residuals, 

• CURV a ,CURVs second derivatives of the astrometric residuals, 



'Residuals in a are multiplied by cos(<5) before computing all the metrics. 
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• ZSIGN a} ZSIGNs third derivatives of the astrometric residuals, 

• RMSH RMS of photometric residuals in magnitudes. 

To use the same controls for both historic and new data (that are typically 
of different accuracy) we normalize the bias and derivatives of the residuals 
by fitting them to a polynomial of degree 3 and dividing the coefficients by 
their standard deviation obtained from the covariance matrix of the fit. 
This filter has b een shown to be effective in removing false identifications 



( IMilani et all |2008[ ) using Pan-STARRS survey simulations. The advantage 
of using simulations is that it is then possible to use the ground truth, i.e., 
the a priori knowledge of the correspondence between objects and their ob- 
servations, to verify the identifications. The control parameters can then be 
adjusted to obtain the desired balance between efficiency and accuracy. The 
values of the contro ls selected in this w ay are close to 1 as expected from 



Gaussian statistics (IMilani et al.l . 120081 , Sec. 6) because the orbits are al- 



ready well constrained. Note that the survey which was simulated does not 
correspond exactly to the realized PS1 survey. What matters for our argu- 
ment is that, since the false identifications are proportional to the square of 
the number of tracklets, it was a pseudo-realistic large scale simulation with 
a huge number of 'known objects' and detected tracklets. 



3. Asymmetric attribution quality control 

The algorithms to be used to propose attributions in a strongly asym- 
metric situation (such as high-accuracy numbered asteroid orbits with new 
un-attributed tracklets) can be more or less as described above, but the meth- 
ods to confirm the attributions require additional controls. The main reason 
is the presence of biases in the astrometric errors. 

Figure [1] shows that the normalized biases of astrometric residuals for the 
over-determined orbits of numbered asteroids are strikingly different from a 
zero-mean Gaussian. The normalized declination bias is qualitatively dif- 
ferent from a Gaussian with a mean of ~ 2.17 and a standard deviation of 
~ 1.86. In right ascension the shape is not too different from a normal distri- 
bution but with a non-zero mean of ~ 0.12 and standard deviation of ~ 0.77. 
The span, curvature and Z-sign have similar non-Gaussian distributions al- 
though the declination bias is the worst case. The causes of this behavior 
and the methods to mitigate their effects are discussed in Section HJ 
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Figure 1: Histogram of normalized biases in the astrometric residuals for all num- 
bered asteroids from AstDyS (April 2010). Top: residuals in right ascension (mul- 
tiplied by cos(<5)). Bottom: residuals in declination. 

We need to take into account that the values of the quality metrics are 
already high even before the attribution when attributing new observations 
to objects with strongly over-determined orbits. Thus, we cannot use small 
values of the control parameters because it would generate the paradoxical 
result that the already computed orbit should be discarded before adding new 
observations. Even if the new observations improve the orbit determination 
because they are of better quality and/or extend the observational time span, 
the systematic error signatures contained in the residuals of the existing orbit 
are not going to be removed. On the other hand, if we use high values of the 
controls we may fail to reject some false attributions. 

Our solution is to take into account the values of the metrics and the 
amount they change as a result of the proposed attribution. E.g., for the 
RMS metrics we accept an attribution only if the increase resulting from the 
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attribution is small (we used < 0.15 in Table [2]). For the other 8 metrics that 
are based upon astrometric residuals we set an upper limit to the increase in 
the absolute value (we used < 2 in Table [2]) . 

3.1. Quality control for new observations 

The new observations may have little effect on the statistical proper- 
ties of the complete set of residuals because the tracklets contain typically 
only 2 to 8 observations while the existing data set typically contains tens 
or even hundreds of observations, i.e., the residuals may be neither signifi- 
cantly better nor significantly worse. Thus, we need to separately consider 
the residuals of the attributed observations for which we require the RMS, 
BIAS and SPAN (typical tracklets have no significant curvature and even 
less Z-sign). For the tests reported in Table [2] we have used the controls 
RMS < 3, \BIAS\ < 3, \SPAN\ < 3. These values were chosen empirically 
based on our exper ience with both real data and simulations as reported in 



Milani et al.l (120051 ). Our experience suggests that it is not possible to use 
theoretically motivated control values from e.g., a \ 2 table although they 
must be of the theoretical order. 

The attribution procedure is recursive with tracklets added one by one 
proceeding in order from the most to least likely as measured by the penalty 
K 4 . When an attribution has passed all the quality controls and a new orbit 
has been fit to all the observations including the new tracklet, the new orbit 
becomes the object's reference orbit. Then, the other tracklets proposed for 
attribution from filter 2 are passed to filter 3 and quality control, and so on. 

This procedure is robust but we cannot claim that it is perfectly reliable. 
Due to the stochastic nature of the observational errors any attribution may 
be wrong; even those that result in the determination of a numbered asteroid 
orbit, although the probability of this happening has to be very small. 

Despite the probability being small, when there are a large number of 
trials the odds of finding low probability events approaches unity. Our pro- 
cessing attributed a 4-observation PS1 tracklet from the year 2010 to the 
numbered asteroid (229833) 2009 BQ25; however, the fit to all the avail- 
able data required discarding 6 observations that were designated 2000 KU51 
from the apparition in the year 2000. In this case the only way to decide 
which attribution to (229833) is correct might be to reexamine the original 
observations. 
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4. Data debiasing for historic data 
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Figure 2: Histogram of normalized astrometric residual biases for all numbered 
asteroids as in Figure Q] but after removing systematic star catalog errors from all 
the observations. 

The most important source of systematic errors in observations of so- 
lar system ob j ects is in the star catalog used for astrometric reduction. 
Chesley et al. suggested that the errors could be mitigated by de- 

biasing the astrometric asteroid data with the measured regional catalog 
biases. The biases can be computed as the average difference in position be- 
tween the star s in on e catalog with respect to anoth er more accurate catal og. 
Cheslev et all fl2010h used the 2MASS star catalog flSkrutskie et al.l . l2006h as 
their reference because it is of good accuracy and covers the entire sky with a 
sufficient number of stars per unit area. The catalog used for the astrometric 
reduction of each asteroid observation is available for 92.8% of the existing 
CCD data. 
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We have implemented an error model consistent with lChesley et al. 



by 1) debiasing the observations by subtracting their calculated biases and 
2) assigning a per-observation weight that is i nverse ly proportional to the 



debiased RMS residuals given in IChesley et all ( 120101 . Table 6) (for the same 



observatory and the same catalog when known). More precisely, the weight 
was 1/(2 • RMS) when known but set to 1/1.5 arcsec -1 for data after 1950, 
for observations performed by photographic techniques, and for CCD obser- 
vations that do not include star catalog information. 

Figure |2] shows that the bias distribution is much less asymmetric after 
implementation of the error model. The mean of the normalized biases is 
now ~ 1.00 in declination (standard deviation ~ 1.26) and ~ 0.05 in right 
ascension (standard deviation ~ 0.80). The debiased set of declinations is 
still biased but improves by a factor ~ 2 and the debiased right ascensions has 
biases at the level of the quality of the best catalogs and is as good as it can 
currently be. Thus, the observation debiasing and weighting improves the 
performance of the asymmetric attribution algorithms but highlights that 
improvements in the star catalogs are still necessary to take advantage of 
modern high-accuracy asteroid astrometry. 

Another way to quantify the improvement of the orbit fit is to measure 
the residuals with respect to PS1 observations of numbered asteroids. The 
histograms in Figure |3] show a significant improvement in the declination 
residuals due to the debiasing; the effect is also significant but less pro- 
nounced in right ascension. The debiasing procedure improves the RMS of 
the residuals from 0.21 to 0.18 arcsec in RA and from 0.23 to 0.16 arcsec in 
DEC. 

To take advanta ge of the significant improvements obtained using the 



Chesley et all (120 101 ) debiasing the orbit catalogs used for all the tests de- 
scribed in the next two sections have been computed in this way. The tech- 
nique is now implemented in the free software OrbFit, version 4.2 and lateiQ. 
The debiased orbit catalogs are now available from the AstDyS online service. 

5. Tests with PS1 data 

The Pan-STARRS prototype telescope, PS1, has a 1.8 m diameter pri- 
mary (with an equivalent aperture of 1.55 m after accounting for the large 



'http: / /adams. dm. unipi.it / orbfit 
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Figure 3: Histograms of declination residuals for PS1 observations of numbered 
asteroids that were submitted to the MPC through 15 May 2011. Left: with 
respect to orbits com puted with the biase d star catalog. Right: after applying 
debiasing according to Cheslev et al. ( 20icl ). 



secondary), a 1 .4 G igaPixel camera, and 7.4 square degree field of view 
( Hodapp et all 2004 ). The PS1 sky su rvey is intended to serve multiple 
scientific goals (e.g., in the solar system iJedicke et all 120071 ) for which ac- 
curate astrometry is just one of many necessary requirements. The all-sky 
coverage (from Hawaii) combined with the pixel scale of 0.26 arcsec makes 
PS1 potentially capable of excellent astrometry of solar system objects. 

The PS1 survey began acquiring engineering data in 2009 and the first 
tests with our KNOWN_SERVER software that implements the algorithms 
described herein were run on data taken in June/ July of that year. These 
tests showed that the PS1 astrometry of small solar system objects was ex- 
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tremely good with a standard deviation of ~ 0.12 arcsec in right ascension, 
~ 0.13 arcsec in declination, and a very small bias. (The small bias was due 
to chance as the observed region of the sky had less star catalog errors than 
average as discussed below.) 

PS1 officially began surveying in May 2010 but the first year's data set 
is not fully homogeneous because there were many adjustments to the tele- 
scope, camera, survey scheduling, image processing, etc. Nevertheless, the 
full set of PS1 observations allows us to assess the way the biases change 
with right ascension. We achieve a more detailed analysis of the PS1 system 
performance using a smaller but homogeneous dataset. 

5.1. Biases in PS1 residuals 

We can assess the regional residual systematic errors in the star catalogs 
using PS1 attributions to numbered asteroids observed through April 2011 
that are widely distributed on the celestial sphere. Note that no debiasing 
i s applied for the PS1 astrometry since it already uses the 2MASS catalog 



( ISkrutskie et all 120061) that was used as the reference catalog in the debiasing 



procedure (IChesley et all 120101 ). 

Figure H] shows the number distribution of the observations as a function 
of right ascension and declination. The dramatic dips in the RA number 
distribution at about 105° and 285° are where the galactic plane crosses the 
equator. The PS1 system has a reduced asteroid detection efficiency in the 
galactic plane because of stellar over-crowding. The distribution is also not 
uniform in declination but there are enough data for our purposes over most 
of the range between —30° and +50°. 

The right ascension residual bias is strongly dependent on right ascension 
as shown in the top panel of Figure There is a pronounced maximum of 
more than 0.1 arcsec between 110° and 220°. Note that our first test data 
set of June/ July 2009 was taken between 270° and 300° of right ascension 
where the biases appear to be only around 0.02 arcsec. On the contrary, 
there is only a weak dependence of the declination residual biases upon right 
ascension. 

The location of the dips to ~ 0.00 arcsec for the declination residual (see 
bottom panel of Figure |5]) match the ranges in RA with limited asteroid 
statistics as shown in Figure 0] — where the galactic plane crosses the equa- 
tor. Indeed, Figure |6] highlights the relationship between the biases of the 
residuals and galactic latitude. It is absolutely clear that where the stellar 
sky-plane density is high on the galactic equator the residuals disappear. 
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Figure 4: (Top) Right ascensions and (bottom) declinations of PS1 observations attributed 
to numbered asteroids up to April 2011. The peak in the bin just above 0° declination is 
due to the Medium Deep fields (see Section . 
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The dependence of right ascension residual biases upon declination shown 
in the top panel of Figure [7] has a significant signature but with lower am- 
plitude that never exceeds 0.08 arcsec. Another significant effect is in the 
dependence of declination residual bias upon declination as shown in the 
bottom panel. There is a relatively constant bias residual of ~ 0.05 arcsec 
from about —30° to 20° declination but then a pronounced trend to negative 
biases as the declination increases to ~ 50°. 

The interpretation of these systematic regional biases would require a 
dedicated study but we may have uncovered regional biases in the 2MASS 
star catalog that are significantly smaller than those of other frequently used 
star catalogs but still relevant when used at the level of accuracy made pos- 
sible by instruments like PS1. In the medium term we expect this problem 
would be removed by debiasing the 2MASS catalog in turn with an even 
better catalog such as the one to be produced by the GAIA mission. 

5.2. Standard deviations and correlations in PS1 data 

To assess the current PS1 astrometry we have used a homogeneous dataset 
processed with the same image processing software and astrom etric reduction 



based upon the 2MASS star catalog. As noted above, the IChesley et al. 



(120101 ) bias model uses 2MASS as the reference catalog so that the PS1 data 
do not need debiasing but the historic data from other observatories have 
been debiased. The attributions to known numbered and multi-opposition 
objects was performed with the KNOWN_SERVER software with uniform 
configuration options such as the quality control parameters. 

The data belong to lunations 138 and 139 from 3/25/2011 to 4/17/2011 
and 4/18/2011 to 5/17/2011 respectively. In each lunation there are tracklets 
generated in two different types of surveys: 1) the 3ir survey including fields 
within about ±30° of the longitude of opposition with four exposures per 
night and 2) the medium deep survey (MD) with eight exposures per visit 
at a few specific bore-sights (the same bore-sight might be visited more than 
once on the same night in different filters). The 3ir opposition fields are 
typically acquired in the q pu rp^ and ipi filters and the MD surveys also use 



the zpi and yp\ filters (see ISchlafly et al.l ( 120121 ) for a detailed description of 



the PS1 filter system). As mentioned above, the 3ir fields cover a wide area 
near opposition while the boresite locations of the 5 MD fields used in this 
work are provided in Table [TJ 

Table [2] gives the basic statistical properties of the residuals for all four 
data subsets. As expected from the discussion in the previous subsection 
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Figure 5: (Top) The middle (thick) curve represents the mean residual in right ascension 
(multiplied by cos 5) as a function of right ascension over 2 degree wide bins. The lower 
and upper curves correspond to ±3 standard errors on the mean. (Bottom) The same for 
the declination residual. 
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Figure 6: (Top) The middle (thick) curve represents the mean residual in right ascension 
(multiplied by cos S) as a function of galactic latitude over 2 degree wide bins. The lower 
and upper curves correspond to ±3 standard errors on the mean. (Bottom) The same for 
the declination residual. 
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Figure 7: (Top) The middle (thick) curve represents the mean residual in right ascension 
(multiplied by cos S) as a function of declination over 2 degree wide bins. The lower and 
upper curves correspond to ±3 standard errors on the mean. (Bottom) The same for the 
declination residual. 
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Tabic 1: Identifiers, borcsite locations and number of reported asteroid observations from 
the PS1 Medium Deep fields used in this work. 



MD Field 


RA (deg) 


Dec. (deg) 


# of obs. 


MD03 


130.591 


44.316 


310 


MD04 


150.000 


2.200 


5742 


MD05 


161.916 


58.083 


41 


MD06 


185.000 


47.116 


287 


MD07 


213.704 


53.083 


6 



Table 2: Mean, RMS and correlation of PS1 right ascension and declination residuals for 
two lunations and two types of PS1 surveys (mean and RMS in arcsec). The last column 
is the number of data points in the measurement. The rows in bold are the results after 
removing the mean (i.e., the standard deviation). 



Survey- 


right ascension 


declination 


number 


Lunation 


Mean 


RMS 


Corr 


Mean 


RMS 


Corr 


resid. 


3tt-138 


0.115 


0.170 


0.759 


0.078 


0.147 


0.691 


67566 




0. 


0.125 


0.542 


0. 


0.125 


0.587 




3tt-139 


0.088 


0.154 


0.609 


0.073 


0.149 


0.614 


44111 




0. 


0.127 


0.420 


0. 


0.129 


0.480 




MD-138 


0.035 


0.120 


0.305 


0.014 


0.120 


0.278 


3533 




0. 


0.115 


0.242 


0. 


0.119 


0.269 




MD-139 


0.038 


0.112 


0.332 


0.019 


0.109 


0.340 


3047 




0. 


0.106 


0.257 


0. 


0.107 


0.323 





the bias in both RA and DEC is sensitive to the sky location and the RMS 
(computed with respect to zero) changes significantly as a consequence. The 
RMS computed with respect to the mean is less sensitive but it too can 
change within a range of 0.02 arcsec. 

We also computed the correlations among the residuals of the same co- 
ordinate that belong to the same tracklet. They have a large spread that 
depends on the observed region in the sky and also upon the removal of the 
constant bias (the mean of the residuals). This is not a surprise since if the 
data are processed ignoring the presence of a bias then the bias reappears 
as a correlation. The only unexpected result is the size of this effect with 
apparent correlations up to 0.76 in RA and 0.69 in DEC. In the 3° wide MD 
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field the overall bias is much less because the target area is subject to low 
biases in the 2MASS catalog and the removal of the overall mean is almost 
equivalent to a direction-sensitive debiasing. This results in in-tracklet cor- 
relations in the range 0.24 to 0.32. These values are a good measure of the 
intrinsic correlation of data taken over a short time span (such as data from 
the same tracklet). 

These results imply that we are not yet ready to build a PS1 error model 
that fully accounts for biases and correlations because these two effects can- 
not yet be neatly separated. To use all the information contained in the 
precise PS1 observatio ns we would need to first apply a 2MASS catalog 



( jSkrutskie et al.l . 120061 ) debiasing, measure the correlation, and then build a 
PSl-spe cific error mod e l and explicitly account for the correlations (as was 
done bv lCarpino et al.l (l2003h : iBaer et all fl201l[ )l Since this is a significant 



undertaking it needs to be the subject of our future work and we need to 
find a method to handle the data in a statistically correct manner. 

5.3. Indications for a Pan-STARRS error model 

There are two main conclusions to be drawn from the analysis of the PS1 
data. 

First, the precision of the PS1 astrometry is very good. The random 
component of the error model for the residuals in both RA and DEC has 
a standard deviation ranging between 0.106 and 0.129 arcsec (see the bold 
lines in Table [2]). This is reasonable taking into account the typical PS1 
point spread function (PSF) full- width at half-max of ~ 1.1 arcsec, the pixel 
scale of ~ 0.26 arcsec, that asteroids move slightly during the PS1 expo- 
sures, and that the vast majority of the reported PS1 asteroid detections 
have a signal-to- noise ratio (SNR) in the range of 5-10 (i.e., for over-sampled 
PSFs we expect the astrom e tric p recision to be ~ FWHM/(2A x SNR), 
Neuschaefer and Windhorstl ril995j )). Only a few professional asteroid ob- 



servers regularly achieve this astrometric accuracy and of the large surveys 
only the Sloan Digital Sky Survey appears to have reached such precision. 
The PS1 advantage is that it achieves this accuracy over the entire sky during 
a long term survey and will generate a huge volume of data. In 2010 PS1 was 
the fifth largest contributor of numbered asteroid observations to the MPC 
with 548, 518 (in the first 11 months of 2011 it was fourth with 935, 001). 
Second, the accuracy of the PS1 astrometry is limited by the presence of 



syste matic errors in the reference star catalog. Although 2MASS (jSkrutskie et al 



20061 ) is the best possible choice for a star catalog it still contains systematic 
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effects at up to the level of 0.1 arcsec that are easily detectable with the mod- 
ern generation of sky surveys. As a consequence, it is not yet possible to fully 
capitalize on the astrometric precision available from surveys like PS1. The 
problem could be solved either with new and improved star catalogs (e.g., 
the ones expected from the GAIA mission) or by building a survey-specific 
PS1 star catalog that is referenced to a catalog with smaller regional biases 
such as Tycho-2 or Hipparcos. In addition to the position-dependent biases 
there is also an intrinsic correlation among the observations belonging to the 
same tracklet that we are not able to model accurately at this time. 

To ensure that the PS1 observations yield the best possible nominal so- 
lutions and reliable covariance estimates the PS1 residuals must be weighted 
in a manner that accounts for systematic errors and their correlations. That 
is, if we use a model that ignores biases and correlations the confidence re- 
gion of the observations of the same tracklet needs to be a sphere containing 
the ellipsoid representing the confidence region of the debiased and corre- 
lated model. If we assume the correlated covariance matrix of the RAs of 
the tracklets with m observations has variances a 2 on the diagonal and all 
correlations are equal to r (i.e., the covariances are all r a 2 ) th en the largest 
eigenvalue is (1 + (m — 1) r) • a 2 ( iMilani and Gronchil . l2010l Section 5.8). 
With m = 4, r = 0.759, a = 0.170 the maximum eigenvalue is 0.308 2 . Thus, 
the worst case from Table [2] requires a weight of 1/0.308 arcsec -1 . Indeed, 
AstDyS uses a weight of 1/0.3 to process the PS1 data — the highest weight 
for any asteroid survey. Some asteroid programs with either special instru- 
ments/methods or very labor intensive reduction procedures do have higher 
weights. 

We are not claiming that this weighting scheme is the best or only solution 
for handling high precision data from modern surveys — only that it is a 
prudent way to use the PS1 data while waiting for a more sophisticated error 
model that could be obtained by a better bias removal and by an explicit 
correlation model. 



6. Accuracy and efficiency 

The purpose of this section is to measure the accuracy and the efficiency 
of our asymmetric attribution procedure. Accuracy is the fraction of correct 
attributions among those proposed by our method while efficiency is the 
fraction of possible attributions that were found. 
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6.1. Accuracy test 

It is very difficult to measure accuracy with real data because it is impos- 
sible to know the ground truth (i.e., which attributions are true/false) while 
measuring accuracy with simulated data would be less convincing because it 
is difficult to simulate all possible causes of false data (at the single detec- 
tion, tracklet, and attribution level). In any event, false data tend to generate 
tracklets with randomly oriented angular velocity that are less likely to be 
attributed to a real asteroid orbit. We therefore consider an attribution test 
with real tracklets and real objects to be more valuable at least for the same 
number of tracklets under consideration. 

Our solution has been to use the fact that our dataset of proposed attri- 
butions can be split in two disjoint subsets: numbered and multi-apparition 
asteroids. The set of all attributions to numbered objects is robust and 
contains only a very small fraction of false attributions. If a tracklet has a 
successful attribution to a numbered asteroid then it cannot be attributed to 
a different asteroid. Thus any tracklet that is already attributed to a num- 
bered asteroid would almost certainly be a false attribution if it could also 
be attributed to a multi-apparition asteroid. 

In our test we used the set of 13, 729 tracklets attributed to numbered 
asteroids in lunation 139 of the PS1 37r survey and attempted to find attri- 
butions to a list of 140, 225 multi-apparition orbits using our algorithm for 
the asymmetric case. The multi-apparition orbits provide much less accurate 
ephemerides and finding an observation inside the confidence ellipse of one 
of the many tested orbits is not a rare event. The first filter provided 44, 675 
candidate attributions, the second filter reduced the number to 8, and the 
third yielded 3 tracklets incorrectly attributable to two asteroids: 2010 GU104 
(2 tracklets) and 2010 GK 2 (1 tracklet). 

For comparison, our algorithm attributed 3, 619 tracklets to 2, 950 multi- 
apparition asteroids in the same lunation after removing all the tracklets that 
were already attributed to numbered asteroids. Thus, the false attribution 
rate can be estimated at either 3/3,619 or 2/2,950 which is < 1/1,000 per 
lunation per observable object. This result is statistically good but cannot be 
ignored because such false attributions would introduce permanent 'damage' 
in the orbit database^]. 



5 False facts are highly injurious to the progress of science, for they often endure long..., 
C. Darwin, The Origin of Man, 1871. 
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The false attribution rate can be reduced to zero through additional con- 
trols. The two asteroids 2010 GU104 and 2010 GK 2 for which false attri- 
butions were found have a historical set of observations that only weakly 
constrain the orbit: two apparitions widely spaced in time (1999-2010 and 
2001-2010 respectively) and not many observations (16 and 18 respectively). 
Thus, the 1 a ephemeris confidence ellipses at the time of the incorrect attri- 
bution have major semiaxes of about 47 and 81 arcmin, respectively. This 
explains why it was possible to find a false attribution passing the quality 
controls in a sample of > 13, 700 tracklets. Moreover, in the first case the 
two attributed tracklets were from a single night and in the second there was 
only one tracklet. Thus, another filter should not accept single-night attri- 
butions to objects that have been observed in only two previous apparitions. 
As a matter of fact, the Minor Planet Center has enforced this rule for a 
long time and we acknowledge that this rule is a meaningful caution to avoid 
contamination of the orbits database by spurious attributions such as the 
ones found in our test. 

6.2. Efficiency test 

Determining the efficiency of our attribution method suffers from the 
same concerns addressed above regarding the use of real or synthetic data. 
Once again, we decided to rely on real data and used the AstDyS catalog of 
numbered and multi-apparition asteroids to identify a set of asteroids whose 
ephemeris definitely places them in each field of view. The PS1 limiting 
magnitude is fainter than most known asteroids when they are at opposition 
so we used objects identified in one night of morning sweet spot observations 
(looking eastwards before sunrise at solar elongations between 60° and 90° 
in the w filter). Main belt asteroids are fainter in the sweet spots than at 
opposition due to distance and phase angle effects so that their range in 
apparent magnitude is better suited to determining the PS1 sensitivity as a 
function of the objects brightness. 

Then we ran KNOWN_SERVER using the same catalog of orbits and all 
the observed tracklets in the same fields to identify objects in the list of those 
which could have been observed because they were in the field of view of the 
PS1 sensor. The efficiency is simply the fraction of observable objects that 
were actually detected. If the false attribution rate is < 1/1, 000 as discussed 
in the previous subsection the statistics of the successful attributions will not 
be significantly contaminated. 
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Figure 8: Tracklet attribution efficiency to well known asteroid orbits as a function 
of predicted apparent V magnitude from one night of the PS1 survey. Each data 
point represents 150 observable objects. The horizontal bars indicate the range of 
values within the bin and the vertical bars represent one standard deviation of the 
estimated efficiency in the bin. 



Figure [8] shows that the attribution efficiency as a function of predicted 
apparent V magnitude has a sharp decline at the limiting magnitude of ~ 21 
(where the efficiency drops to half the peak value). The peak efficiency of 
tmax = 0.78 ± 0.04 occurs at V > 19 but decreases by less than a standard 
deviation in the brightest bin near V ~ 18 to 0.74 ± 0.04. Note that the 
well known offset of a few ten ths of a m a gnitu de between predicted and 
actual asteroid V magnitudes (IJuric et all 120021 ) is unimportant — all we 
care about here is the peak efficiency independent of magnitude. 

The problem in interpreting these results is that they measure several 
different contributions to the efficiency including the fill factor (f, the frac- 
tion of the focal plane covered by active sensing pixels), the detector sensi- 
tivity (cd), the efficiency of the image processing pipeline (IPP) in detect- 
ing moving objects (erpp), the Moving Object Processing System's (MOPS, 
Kubica et all 120071 ) efficiency at linking detections into tracklets (chops), 
and the efficiency of the attribution algorithm (e a ttrib)- Disentangling each 
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effect as a function of V is difficult but unnecessary for our purposes. In- 
stead, we know that the generic efficiency e x > e max and attempt to establish 
a minimum value of e a ttrib- 

The PS1 camera consists of 60 orthogonal transfer array (OTA) CCDs 
arranged in an 8 x 8 array (the four corners do not have CCDs) and each OTA 
contains an on-chip 8x8 mosaic of 'cells'. The camera fill factor includes the 
physical gaps between OTAs, smaller gaps between cells on a single OTA, 
area lost to defective cells and bad pixels, and the overlap between the sensor 
and the Field Of View (FOV) of the optical system. There are additional 
losses of the order of 1% specific to individual exposures due to a 'dynamic 
mask' applied to remove bright diffraction spikes and internal reflections. 

The PS1 image processing team produces periodic analysis of sensor fill 
factor as the camera tuning improves. The most recent study (May 2010) 
yields / = 0.79 due to a 7.0% loss due to inter-OTA gaps, a 4.3% loss due to 
inter-cell gaps, and 9.7% mask fraction in the UN-vignetted 3.0-degree FOV. 
For our KNOWN_SERVER efficiency study we only consider asteroids that 
should appear in the 3.0-degree FOV so / = 0.79 is applicable. 

The fill factor estimate implies that the peak efficiency (at magnitudes 
between 19 and 20) leaves little room for losses due to the IPP, MOPS, and 
attribution software. Each of the processing steps has a minimum efficiency 
of 0.981q;o5- I.e., the overall efficiency is dominated by the fill factor and the 
efficiency of the other steps including attribution is consistent with 100%. 
The apparent drop in the efficiency for brighter apparent magnitudes is not 
significant. It might indicate some problems in the image processing and 
accurate astrometric reduction of bright detections. The PS1 CCDs saturate 
at about w = 15.9 and the iw-filter is used for sweet spot observations. 

7. Conclusions and future work 

We proposed a new procedure to identify known asteroids among the 
new observations from a survey. This procedure solves the problems due 
to the asymmetry in quantity and quality between the data and historic 
observations. 

We tested the procedure with real data from the Pan-STARRS PS1 survey 
and assessed the performance of our algorithms and the astrometric accuracy 
of the survey data. The main results are the following. 

First, the new algorithms are accurate and efficient. For multi- apparition 
asteroids the false attribution fraction is less than 1/1, 000 and even those can 
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be eliminated by following the MPC's good practice of requiring two nights 
of data for a recovery at a new apparition. The algorithm's attribution 
efficiency is high and consistent with 100% but cannot really be measured 
because it is entangled with other efficiency losses such as the fill factor. 

Second, the PS1 data have significantly lower astrometric error than other 
asteroid surveys. This error can be identified only after removing the biases 
due to systematic errors in the star catalogs. Indeed, we arrived at the 
conclusion that even the 2MASS star catalog contains enough biases to affect 
the PS1 error model. It is likely that the use of other catalogs, or at least 
debiasing with respect to them, would further improve the PS1 error model; 
e.g., Tycho-2 could be used now and the GAIA catalog in the future. 

Third, we consider that the PS1 data can be included in the fit for as- 
teroid orbits with weights corresponding to 1/0.3 arcsec -1 that implicitly 
account for the effects of correlations. A model with debiasing of the 2MASS 
astrometry and explicitly taking into account the correlations could allow 
a further improvement in the diagonal elements of the weighting matrix by 
another factor 2 ~ 3. 



7.1. Future work 

Apart from the improvements mentioned above there are three areas 
where there is room for future developments. 

The first is in the use of KNOWN_SERVER, or similar algorithms and 
software, as a filter to remove observations of known objects from a new set 
of data to leave subset with a larger fraction of potential new discoveries. 
Although we think that this principle is valid we have not yet been able to 
test the idea on real data. For the removal of the known objects to be a 
significant contribution in decreasing the false identification rate, the false 
tracklet rate must be below some threshold and the number of known objects 
must be large compared to the number of unknown objects at the system's 
limiting magnitude. For the current PS1 data the false discoveries due to 
spurious detections (that form false tracklets) are more important than the 
false discoveries due to incorrect linking of true tracklets. 

The second is the possibility of using KNOWN_SERVER as an alarm to 
detect unusual phenomena in well known as teroids. The mos t inter esting; 
cases could be the Main Belt Comets (MBC, iHsieh and Jewittl . 120061 ). 

As an example, the numbered asteroid (300163) was found to be a MBC 
on the basis of PS1 observations t hat showed an image wider than the PSF of 
nearby stars (IHer genrotherl . 120 1 ll ) . However, the standard KNOWN_SERVER 
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output automatically flags these observations as very unusual in at least 3 
different ways. Two of the strange flags are astrometric in nature: BIAS a = 
4.01, corresponding to a systematic shift by 0.85 arcsec backward (with re- 
spect to orbital motion), and BIASs = 2.11, corresponding to a shift by 
0.44 arcsec North. These could be interpreted as an effect of displacement 
of the center of light with respect to the center of mass, and/or as an effect 
of non-gravitational perturbations. The other strange flag was that the ob- 
servation's apparent magnitudes were on average brighter than the predicted 
ones by about 1.05 which is also likely a consequence of the outburst. 

The problem is to define a filter selecting anomalous behaviors such as 
the one above as an indication of possible orbital, luminosity and/or image 
shape changes. The main challenge is to design such a filter with a low false 
positive rate that allows for dedicated follow up of the MBC candidates. 

The third is the possibility of improving the quality of other existing 
large datasets (such as those generated by other sky surveys or even the 
MPC asteroid astrometry database) by either detecting missing attribution 
or proposing deletion of dubious ones. This work would require the collabo- 
ration of the data providers for their insight on the datasets' problems. 
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